IMAGE PROCESSING USING VECTORS

This invention is directed to a method of image segmentation using a vector 
quantization approach. 

The aim of the algorithm is to partition the picture into a fixed number of
segments, taking into account the location and value of each pixel. For moving
sequences, we additionally desire a smooth transition in the segmentation from
one picture to the next.

We start by representing the picture in a multidimensional space, which we shall
call the universe. This is the product of the ordinary two-dimensional space of
the picture, which we shall call the canvas, and the pixel space, which is the
space in which the pixel values themselves are defined. For example, if we are
segmenting an RGB picture, the pixel space will have three dimensions and the
universe will therefore have five dimensions.

The picture is represented as a set of points in the universe, one for each pixel. 
The co-ordinates of the pixel in the universe describe its position on the canvas 
together with its value in pixel space. 
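As an illustration only, the mapping from a picture to points in the universe
might be sketched as follows. The function name picture_to_universe and the
list-of-rows picture layout are assumptions made for this sketch and are not
part of the described method.

```python
def picture_to_universe(picture):
    """Map an RGB picture to a list of 5-D points (x, y, r, g, b).

    `picture` is assumed to be a list of rows, each row a list of
    (r, g, b) tuples; this layout is chosen only for the sketch.
    """
    points = []
    for y, row in enumerate(picture):
        for x, (r, g, b) in enumerate(row):
            # Canvas co-ordinates first, then the pixel-space co-ordinates.
            points.append((float(x), float(y), float(r), float(g), float(b)))
    return points


# A 2x2 RGB picture gives four points in a five-dimensional universe.
picture = [[(255, 0, 0), (0, 255, 0)],
           [(0, 0, 255), (255, 255, 255)]]
universe = picture_to_universe(picture)
```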



The segmentation is simply a partitioning of the set of pixels into a fixed number 
of subsets. 

The algorithm works as follows. We start with some initial segmentation of the
picture into the desired number of segments. At the very start of a sequence,
this might correspond to a straightforward division of the canvas into equal areas.
Subsequently, the initial segmentation may be the segmentation of the previous
picture. We then perform two steps.



First, we calculate the centroid of each segment, according to the values of the
segment's pixels in the universe.

We then reassign each pixel to the segment with the closest centroid, where
closeness is measured by the Euclidean distance in the universe.

These two steps may be repeated once or more using the same picture. The
result is a new segmentation, which may be used as the initial segmentation for
the next picture.
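The two steps can be sketched as below. This is a minimal illustration, not a
definitive implementation: the function names, the use of integer segment
labels, and the choice to skip empty segments are all assumptions made for the
sketch. Squared distance is used because it gives the same nearest centroid as
the Euclidean distance.

```python
def centroids_of(points, labels, k):
    """Step 1: mean of each segment's points; None for an empty segment."""
    dims = len(points[0])
    sums = [[0.0] * dims for _ in range(k)]
    counts = [0] * k
    for p, s in zip(points, labels):
        counts[s] += 1
        for d in range(dims):
            sums[s][d] += p[d]
    return [tuple(v / counts[s] for v in sums[s]) if counts[s] else None
            for s in range(k)]


def reassign(points, centroids):
    """Step 2: label each point with its nearest centroid in the universe."""
    def dist2(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))
    # Assumption for this sketch: empty segments are simply not candidates.
    live = [s for s, c in enumerate(centroids) if c is not None]
    return [min(live, key=lambda s: dist2(p, centroids[s])) for p in points]


def segment(points, labels, k, iterations=1):
    """Apply the two steps `iterations` times, starting from `labels`."""
    for _ in range(iterations):
        labels = reassign(points, centroids_of(points, labels, k))
    return labels
```

For a moving sequence, the labels returned for one picture would be passed in
as the initial labels for the next.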


A segmented version of the picture can be created by replacing the pixel values 
in each segment with the projections of the segment centroids onto the pixel 
space. 
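A sketch of this replacement step follows, assuming each point is laid out as
(x, y, v1, v2, ...) so that projection onto the pixel space amounts to dropping
the two canvas co-ordinates. The function name and the dictionary output are
choices made for the illustration.

```python
def render_segmentation(points, labels):
    """Replace each pixel value with the pixel-space part of its centroid.

    Returns a dict mapping canvas position (x, y) to the new value tuple.
    """
    sums, counts = {}, {}
    for p, s in zip(points, labels):
        acc = sums.setdefault(s, [0.0] * len(p))
        for d, v in enumerate(p):
            acc[d] += v
        counts[s] = counts.get(s, 0) + 1
    # Projection onto pixel space: keep only co-ordinates after x and y.
    projected = {s: tuple(v / counts[s] for v in sums[s][2:]) for s in sums}
    return {(p[0], p[1]): projected[s] for p, s in zip(points, labels)}
```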

The algorithm has the following features, each of which may be optionally
combined with any of the others:



§ all segments have the same mass. 

§ the segmentation operates recursively, starting from an initial
segmentation

§ pixels are assigned to clusters and then the clusters are moved to reflect 
their new membership 

The algorithm performs more consistently than previous approaches.


The algorithm can actually be considered as an adaptive vector quantization 
process. The universe is the vector space, and the codebook is the set of 
segment centroids. The codebook is updated according to the data being coded. 
Vector quantization helps to explain the algorithm as a compression technique in
which the goal is to describe the information using a limited set of representative 
points in the universe. 

We now turn to some details of the algorithm.

The performance of the algorithm depends on the relative scaling of the
co-ordinate axes in the universe. For example, if on the one hand the canvas
co-ordinates are multiplied by a large scaling factor, the algorithm will tend to
pay more attention to the spatial component of the overall Euclidean distance and
the final segmentation will look more like a simple partitioning of the canvas. If
on the other hand the pixel space co-ordinates are given a large scaling factor,
the segmentation will be dominated by pixel values and each segment will be spread
across the canvas, giving a result closer to simple histogram analysis.


Good results have been obtained by setting the relative scaling factors of the
co-ordinate axes so as to equalize the variances of the pixels in all the
dimensions of the universe. An exception to this rule is that the two canvas
co-ordinates are given equal scaling, determined by the wider (usually
horizontal) co-ordinate. Setting the scaling according to variance is equivalent
to using the so-called Mahalanobis distance, as described for example in
http://www.engr.sjsu.edu/~knapp/HCIRODPR/PR_Mahal/M_metric.htm, in the special
case where the pixel co-ordinates in the universe are assumed to be
uncorrelated.
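The variance-based scaling might be sketched as follows. Two points are
assumptions of this sketch rather than statements of the method: the "wider"
canvas co-ordinate is taken to be the one with the larger standard deviation,
and an axis with zero variance is left unscaled. Points are assumed to be laid
out as (x, y, v1, v2, ...).

```python
import math


def axis_scales(points):
    """Return one scaling factor per axis so that scaled variances are equal.

    Both canvas axes share the factor derived from the wider canvas axis.
    """
    n = len(points)
    stds = []
    for d in range(len(points[0])):
        mean = sum(p[d] for p in points) / n
        var = sum((p[d] - mean) ** 2 for p in points) / n
        stds.append(math.sqrt(var))
    canvas_std = max(stds[0], stds[1])
    canvas_scale = 1.0 / canvas_std if canvas_std > 0 else 1.0
    scales = [canvas_scale, canvas_scale]
    scales += [1.0 / s if s > 0 else 1.0 for s in stds[2:]]
    return scales


def rescale(points, scales):
    """Multiply every co-ordinate of every point by its axis scale."""
    return [tuple(v * s for v, s in zip(p, scales)) for p in points]
```

Dividing each axis by its standard deviation and then using the Euclidean
distance corresponds to a Mahalanobis distance with a diagonal covariance
matrix.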



Other embodiments employ methods of dynamically varying the relative scaling
in order to minimize some error measure. This is difficult, as the obvious error
measures available are directly dependent on the scaling. In one embodiment,
the overall vector quantization error is expressed as the product of the errors
in each co-ordinate, with the constraint that all the scaling factors sum to a
constant.



In the algorithm, the number of segments is a parameter chosen by the user. In 
alternatives, the number of segments is chosen as a function of the input data. 
For example, the number of segments may be chosen so that the variance of the 
overall vector quantization error approaches a predetermined constant value. 


It is possible that a segment may disappear after some frames if its centroid turns
out to be further from every pixel than the centroid of some other segment. In many
cases, this is desirable behaviour, as when an object disappears from the screen.
In other embodiments, the mechanism may allow for the introduction of new segments.
Particular algorithms may establish criteria for splitting an existing segment into
two, and possibly also for merging segments that end up being close together in
the universe. The criteria could be based on the variance of the vector
quantization error, or on the kurtosis of the distribution of the pixels in each
segment as a measure of bimodality.
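By way of illustration, a kurtosis test along one co-ordinate of a segment
might look like the sketch below. The function names and the threshold of 1.8
(the kurtosis of a uniform distribution) are assumptions made for the example;
no particular threshold is specified in this description.

```python
def kurtosis(values):
    """Non-excess kurtosis m4 / m2**2; None if the variance is zero."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        return None
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2)


def looks_bimodal(values, threshold=1.8):
    """Hypothetical split criterion: low kurtosis suggests two clusters."""
    k = kurtosis(values)
    return k is not None and k < threshold
```

A segment whose values along some axis fall into two well-separated groups has
a kurtosis close to 1, while a single peaked cluster gives a larger value.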


Another approach to deciding whether to add or remove segments is to run two 
or more parallel versions of the algorithm with different numbers of segments and 
to base the decision on the difference in overall error between the two versions. 

One possible solution to the problem of the disappearance and reappearance of
objects because of global motion in the scene is to impose a toroidal structure on
the canvas. This is done by stitching the left edge to the right edge and the top
edge to the bottom edge. Centroids that disappear off one edge will now
reappear at the opposite edge and will be available for re-use. Care needs to be
taken in the distance definition to make sure that the shortest distance is used.
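A sketch of a distance that respects the toroidal canvas is given below. The
function name and argument layout are assumptions for the illustration. Points
are taken to be (x, y, v1, v2, ...), and width and height must be expressed in
the same (possibly scaled) units as the canvas co-ordinates.

```python
def toroidal_dist2(p, c, width, height):
    """Squared Euclidean distance with wrap-around on both canvas axes."""
    dx = abs(p[0] - c[0]) % width
    dy = abs(p[1] - c[1]) % height
    # Take the shorter way round the torus on each canvas axis.
    dx = min(dx, width - dx)
    dy = min(dy, height - dy)
    # The pixel-space co-ordinates do not wrap.
    rest = sum((a - b) ** 2 for a, b in zip(p[2:], c[2:]))
    return dx * dx + dy * dy + rest
```

For example, on a canvas of width 10, a pixel at x = 0 and a centroid at x = 9
are one unit apart rather than nine.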

Other variations on the basic Euclidean distance are also possible. One
possibility, which is simpler to implement in hardware, is the Manhattan distance
or L1 norm. This is the distance between two points measured by walking
parallel to the co-ordinate axes. Another possibility is to take the maximum of the
differences measured along the co-ordinate axes.
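The two alternatives can be written as follows. The function names are chosen
for this sketch; the maximum-difference measure is commonly known as the
Chebyshev distance.

```python
def manhattan(p, c):
    """Sum of absolute differences along the co-ordinate axes (L1 norm)."""
    return sum(abs(a - b) for a, b in zip(p, c))


def chebyshev(p, c):
    """Largest absolute difference along any co-ordinate axis."""
    return max(abs(a - b) for a, b in zip(p, c))
```

Neither needs a multiplication or a square root, which is what makes them
attractive for hardware.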
Certain embodiments use a pixel space in which the co-ordinates are simply the
luminance, YUV or RGB values of the pixels. Others look at segmentation on the
basis of motion vectors and this is described in a separate section below.

There are many other possibilities for features that can be included in the pixel
space, for example local measures of noise or texture, or (in the case of the Delft
presentation referred to above) Gabor jets. For this reason, the pixel space (or
the whole universe) is sometimes referred to as the feature space.

The description so far assumes that each pixel belongs to just one segment, so
that assignment of pixels to segments is a hard decision. It is possible to replace
this with a soft decision, in which each pixel carries a set of probabilities of
membership of each segment. The output of the algorithm can be based on a
hardened version of the decision (assigning the segment with the highest
probability) while the soft decision is retained for the recursive processing.
Alternatively, the soft decision may have significance in the output of the
algorithm. For example, if the pixel space consists of motion vectors, the output
may consist of several motion vectors assigned to each pixel, each with a relative
weight.
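How the membership probabilities are derived is not specified here. One
possible sketch, purely as an assumption, weights each segment by a decaying
exponential of its squared distance and normalizes the weights to sum to one.
The function names and the sharpness parameter are hypothetical.

```python
import math


def soft_memberships(point, centroids, sharpness=1.0):
    """Assumed soft decision: normalized exp(-sharpness * squared distance)."""
    d2 = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    best = min(d2)  # subtract the minimum to avoid underflow in exp()
    weights = [math.exp(-sharpness * (d - best)) for d in d2]
    total = sum(weights)
    return [w / total for w in weights]


def harden(memberships):
    """Hard decision: index of the segment with the highest probability."""
    return max(range(len(memberships)), key=memberships.__getitem__)
```

With a very large sharpness the memberships approach the hard decision of the
basic algorithm.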

The process of calculating centroids and assigning vectors can be repeated
several times per picture. Performing several iterations is necessary if the aim is
to segment a single picture starting from trivial initial conditions. However, we
have found that for moving sequences there may be little to be gained from
performing more than one iteration per picture.

The following describes additional features of the algorithm in the case where the
pixel space includes motion or displacement vectors.
The algorithm can be used to segment a motion vector field resulting from
another process such as Phase Correlation (PhC). Segmentation of the motion
vector field fits very well with a model in which a picture consists of distinct
objects, each with its own speed and direction of motion. However, it does not
work so well if the objects are rotating, receding or approaching or if their
distance from the camera is not uniform. A better description of the motion in
each object can be obtained by replacing the basic motion vector with an affine
transform, in which the horizontal and vertical motion vector components are
each allowed to vary linearly with the spatial co-ordinates of the pixel on the
canvas.

The affine motion model can be applied to the segmentation algorithm by 
expressing each segment centroid as a set of six parameters describing an affine 
transform. The pixel-space component of the distance metric then becomes the 
distance between the motion vector at the point concerned and the affine model 
evaluated at that point. 
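The pixel-space part of the distance under the affine model might be sketched
as below. The ordering of the six parameters (a0, a1, a2, b0, b1, b2) and the
function names are assumptions for the illustration; how the parameters are
fitted to a segment is not covered by this sketch.

```python
def affine_motion(params, x, y):
    """Motion vector predicted at canvas position (x, y) by a 6-parameter model.

    Horizontal component: a0 + a1*x + a2*y
    Vertical component:   b0 + b1*x + b2*y
    """
    a0, a1, a2, b0, b1, b2 = params
    return (a0 + a1 * x + a2 * y, b0 + b1 * x + b2 * y)


def motion_dist2(x, y, vector, params):
    """Squared distance between the measured vector and the affine model."""
    mx, my = affine_motion(params, x, y)
    return (vector[0] - mx) ** 2 + (vector[1] - my) ** 2
```

A pure translation is the special case in which a1, a2, b1 and b2 are zero, so
the basic motion vector model is recovered.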

Motion vectors give rise to the possibility of an additional component in the 
distance metric, based on the error in the pixel domain when the motion vector is 
applied to a picture. This displaced frame difference can be incorporated into the 
distance metric with appropriate scaling. The result is a segmentation of the 
motion vector field that takes into account the fidelity of the motion compensated 
prediction. 

Some features of the performance of the algorithm: 

§ the variation in the segmentation from one frame to the next is small. This 
is difficult to achieve with segmentation algorithms that do not use recursion 
§ with the scaling chosen, the segments are not necessarily contiguous but 
the parts of each segment remain close together. 
Simulations show that the property of smooth variation is quite reliable and that
the segmentation is good provided a suitable number of segments is chosen.
The suitable number may depend on the sequence.

Replacement of "raw" PhC motion vectors by a segmented version can improve
the overall quality of the motion vector field. On occasion this is also true of
the version based on affine transforms.

The algorithm performs fast and (subjectively) very well on demanding picture
material.

It will be appreciated by those skilled in the art that the invention has been 
described by way of example only, and that a wide variety of alternative 
approaches may be adopted. 




